INTRODUCTION

For sample sizes up to n = 50 an ordered sample (X1...Xn) may be
tested for normality by a calculation involving a vector (V1...Vn) that
is tabulated in (1). If V and X are column vectors the test statistic,
also tabulated in (1), is

$$ W = \frac{(V^{T} X)^{2}}{\sum_{i=1}^{n} (X_i - \bar{X})^{2}} $$


W takes values between 0 and 1; as W approaches 1.0 the
distribution of (X1...Xn) comes closer and closer to being Gaussian.
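As a minimal sketch of the calculation, the Python function below evaluates W for a coefficient vector and a sample. The function name `w_statistic` is hypothetical, and the n = 3 coefficients (-0.7071, 0, 0.7071) are taken from the published Shapiro-Wilk tables purely as a small check; they are written in ascending order to match the ordered sample. For real use the vector V for the sample size in question would have to come from the table in (1).

```python
def w_statistic(v, x):
    """Return W = (V'X)^2 / sum((x_i - mean)^2) for coefficients v and sample x."""
    xs = sorted(x)  # the test is defined on the ordered sample
    if len(v) != len(xs):
        raise ValueError("v and x must have the same length")
    mean = sum(xs) / len(xs)
    numerator = sum(vi * xi for vi, xi in zip(v, xs)) ** 2
    denominator = sum((xi - mean) ** 2 for xi in xs)
    if denominator == 0:
        raise ValueError("all sample values are equal; W is undefined")
    return numerator / denominator


# Illustrative coefficients for n = 3 (an assumption for this example only).
V3 = [-0.7071, 0.0, 0.7071]

w_even = w_statistic(V3, [1.0, 2.0, 3.0])  # evenly spaced values
w_skew = w_statistic(V3, [1.0, 2.0, 9.0])  # one value far from the others
```

The evenly spaced sample gives a W very close to 1, while the lopsided sample gives a noticeably smaller value, which is the behaviour the text describes.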

When dealing with a very large sample of N measurements, how does
one proceed to test for normality, since V is not available for samples
larger than 50? A common procedure is the chi-squared test for
deviations from an expected distribution. A more attractive but
approximate method, outlined below, is to take a sample of n = 50 from
the cross section of the large sample of N values. These 50 values may
then be tested with (V1...V50) from the table in (1).

METHOD 

The procedure for obtaining n = 50 values from N values is:

The i-th ordered value of the sample (X1...X50)

equals the (NPi + .50)th ordered value of the sample (X1...XN),

rounding to the nearest integral value.

The vector P = (P1...P50) is obtained from (2) using 50 percent ranks
with the tails of the distribution adjusted outward to give a standard
deviation for (X1...X50) close to that of the large sample (X1...XN)
and with W close to 1. The vector P is:

P = .0048, .0322, .0531, .0729, .0928, .1126, .1325, .1524, .1722, .1921,
    .2119, .2318, .2517, .2715, .2914, .3113, .3311, .3510, .3709, .3907,
    .4106, .4305, .4503, .4702, .4901, .5099, .5298, .5497, .5695, .5894,
    .6093, .6291, .6490, .6689, .6887, .7086, .7285, .7483, .7682, .7880,
    .8079, .8278, .8476, .8675, .8873, .9072, .9270, .9469, .9678, .9952

If N = 10,000 the required set (X1...X50) equals the (49th, 323rd,
532nd ... 9952nd) ordered values of the sample (X1...X10,000).
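The selection rule can be sketched in Python as follows. The function names are hypothetical. Two choices here are assumptions rather than part of the text: P is stored as integer ten-thousandths so that the arithmetic is exact, and exact halves are rounded toward the middle of the sample, a rule chosen only because it reproduces the 49th, 323rd, 532nd and 9952nd positions quoted for N = 10,000.

```python
import math
from fractions import Fraction

# The vector P from the text, in units of 1/10,000 (exact integers).
P_TEN_THOUSANDTHS = [
    48, 322, 531, 729, 928, 1126, 1325, 1524, 1722, 1921,
    2119, 2318, 2517, 2715, 2914, 3113, 3311, 3510, 3709, 3907,
    4106, 4305, 4503, 4702, 4901, 5099, 5298, 5497, 5695, 5894,
    6093, 6291, 6490, 6689, 6887, 7086, 7285, 7483, 7682, 7880,
    8079, 8278, 8476, 8675, 8873, 9072, 9270, 9469, 9678, 9952,
]


def cross_section_positions(n_large):
    """Return the 50 one-based ranks (N*P_i + .50), rounded to integers."""
    positions = []
    for p in P_TEN_THOUSANDTHS:
        target = Fraction(n_large * p, 10000) + Fraction(1, 2)
        if p < 5000:
            # Lower half: an exact half rounds up (assumed tie rule).
            rank = math.floor(target + Fraction(1, 2))
        else:
            # Upper half: an exact half rounds down (assumed tie rule).
            rank = math.ceil(target - Fraction(1, 2))
        positions.append(min(max(rank, 1), n_large))  # keep ranks inside 1..N
    return positions


def select_cross_section(values):
    """Order the large sample and pick the 50 cross-section values."""
    ordered = sorted(values)
    return [ordered[rank - 1] for rank in cross_section_positions(len(ordered))]


positions = cross_section_positions(10000)
sample_50 = select_cross_section(range(1, 10001))
```

With the integers 1 to 10,000 as the large sample, each selected value equals its own rank, so `sample_50` makes the chosen positions easy to inspect.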

The vector (V1...V50) = (-.3751, ... -.0035, .0035, ... .3751) and W
would be calculated as usual. From tables of the unit normal
distribution a cross-section sample of 50 yielded a sigma of 1.004
(for 50 degrees of freedom) and a W of .9996, both sufficiently close
to unity for general application.
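The sigma check on the unit normal distribution can be repeated with a few lines of Python. This is only a sketch under one assumption: that "sigma for 50 degrees of freedom" means the root mean square of the 50 values about the known mean of zero, with a divisor of 50. The W value is not recomputed here because the full vector V from (1) is not reproduced in the text.

```python
from math import sqrt
from statistics import NormalDist

# The vector P from the text.
P = [
    .0048, .0322, .0531, .0729, .0928, .1126, .1325, .1524, .1722, .1921,
    .2119, .2318, .2517, .2715, .2914, .3113, .3311, .3510, .3709, .3907,
    .4106, .4305, .4503, .4702, .4901, .5099, .5298, .5497, .5695, .5894,
    .6093, .6291, .6490, .6689, .6887, .7086, .7285, .7483, .7682, .7880,
    .8079, .8278, .8476, .8675, .8873, .9072, .9270, .9469, .9678, .9952,
]

# Unit normal quantiles at each P_i stand in for the tabulated values.
quantiles = [NormalDist().inv_cdf(p) for p in P]

# Root mean square about zero, dividing by 50 (the assumed reading).
sigma = sqrt(sum(z * z for z in quantiles) / len(quantiles))
```

Under this reading `sigma` should land close to the 1.004 quoted above; dividing by 49 about the sample mean instead would give a somewhat larger figure.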

SUMMARY 

An alternative method has been outlined to test whether a large sample
is Gaussian in distribution. Instead of a chi-squared test of fit, a
new statistic W is evaluated using a cross-section sample of 50 from a
much larger sample of data. If the large sample contains at least 100
values, the technique yields reliable results which may be assessed for
significance against tabulated percentiles of W.

JOHN  SKORY 
Math  Statistician 